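Object detection has always been practical. There are so many things in our world that recognizing them can not only increase our automatic knowledge of our surroundings but can also be lucrative for those interested in starting new businesses. One of these attractive objects is the license plate (LP). Beyond its security uses, license plate detection can also be used to build creative businesses. With the development of object detection methods based on deep learning models, an appropriate and comprehensive dataset becomes doubly important. However, because license plate datasets are frequently used commercially, publicly available ones are limited, not only in Iran but worldwide. The largest Iranian dataset for license plate detection has 1,466 images. In addition, the largest Iranian dataset for recognizing license plate characters has 5,000 images. We have prepared a complete dataset comprising 20,967 car images, along with all detection annotations for the whole license plate and its characters, which is useful for various purposes. In addition, the total number of license plate images for character recognition applications is 27,745.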
translated by 谷歌翻译
Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based scoring metrics for finding important examples are GraNd and its estimated version, EL2N. In this work, we employ these two metrics for the first time in NLP. We demonstrate that these metrics need to be computed after at least one epoch of fine-tuning and they are not reliable in early steps. Furthermore, we show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it. This paper details adjustments and implementation choices which enable GraNd and EL2N to be applied to NLP.
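As a rough illustration of the scoring and pruning step described above, the following minimal sketch computes an EL2N-style score as the L2 norm of the difference between the softmax output and the one-hot label, then drops the highest-scoring fraction of examples. The function names (`el2n_score`, `prune_highest`) and the toy inputs are assumptions made for this example, not the paper's implementation, and the sketch omits details such as averaging scores over several training runs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label)) for one example."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if i == label else 0.0)) ** 2 for i, p in enumerate(probs))
    )


def prune_highest(scores, fraction):
    """Return the indices kept after dropping the `fraction` of examples
    with the highest scores (hypothetical helper for illustration)."""
    n_drop = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[: len(scores) - n_drop])
```

In line with the abstract, the logits fed to such a function would come from a model that has already been fine-tuned for at least one epoch, since earlier scores are reported to be unreliable.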
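Recommender systems (RSs) aim to model and predict user preferences as users interact with items such as points of interest (POIs). These systems face several challenges, such as data sparsity, that limit their effectiveness. In this paper, we address this problem by incorporating social, geographical, and temporal information into the matrix factorization (MF) technique. To this end, we model social influence based on two factors: the similarity between users in terms of common check-ins, and the friendships between them. We introduce two kinds of friendship, based on the explicit friendship network and on high check-in overlap between users. We base our friendship algorithm on the centers of users' geographical activity. The results show that our proposed model outperforms the state-of-the-art on two real-world datasets. More specifically, our ablation study shows that the social model improves the performance of our proposed POI recommendation system in terms of Precision@10 on both the Gowalla and Yelp datasets.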
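Computer vision has played an important role in intelligent transportation systems (ITS) and traffic surveillance. Alongside the rapid growth of automated vehicles and crowded cities, the adoption of deep neural networks has made automated and advanced traffic management systems (ATM) built on video surveillance infrastructure possible. In this research, we provide a practical platform for real-time traffic monitoring, including 3D vehicle/pedestrian detection, speed detection, trajectory estimation, congestion detection, and monitoring of the interaction between vehicles and pedestrians, all using a single CCTV traffic camera. We adapt a custom YOLOv5 deep neural network model for vehicle/pedestrian detection, together with an enhanced SORT tracking algorithm. A hybrid satellite-ground based inverse perspective mapping (SG-IPM) method is also developed for automatic camera calibration, which leads to accurate 3D object detection and visualization. We also develop a hierarchical traffic modelling solution based on short-term and long-term temporal video data streams to understand traffic flow, bottlenecks, and risky spots for vulnerable road users. Several experiments on real-world scenarios and comparisons with the state-of-the-art are conducted using various traffic monitoring datasets, including MIO-TCD, UA-DETRAC, and GRAM-RTM, collected from highways, intersections, and urban areas under different lighting and weather conditions.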
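To make the role of inverse perspective mapping concrete, the sketch below applies a 3x3 homography to map image pixels onto ground-plane coordinates and then estimates speed from two mapped positions. The homography values, the function names (`apply_homography`, `speed_mps`), and the assumption that ground coordinates are in metres are all illustrative; the abstract does not specify how SG-IPM derives its calibration.

```python
import math


def apply_homography(h, u, v):
    """Map pixel (u, v) to ground-plane (x, y) using a 3x3 homography `h`
    given as a list of three rows."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return x / w, y / w


def speed_mps(p0, p1, dt):
    """Speed between two ground-plane points (assumed metres) observed
    `dt` seconds apart."""
    return math.hypot(p1[0] - p0[0], p1[1] - p0[1]) / dt


# Hypothetical calibration: a pure scaling of 0.1 m per pixel.
H_EXAMPLE = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
```

With a real camera the matrix would encode the full perspective transform, so the same pixel displacement corresponds to different ground distances depending on where in the image it occurs.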
Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean test images, yet persistently predicts an attacker-defined label for any sample in the presence of the backdoor trigger. Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain. In this work, we revisit the traditional backdoor threat model and incorporate additional video-related aspects to that model. We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically, leading to highly effective attacks in the video domain. In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain. And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models, where we show that attacking a single modality is enough for achieving a high attack success rate.
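The distinction between static and dynamic temporal extension can be pictured with a toy sketch. It assumes, as one plausible reading of the abstract, that "static" means the same trigger patch at the same position in every frame, while "dynamic" means the trigger varies from frame to frame (here, its horizontal position). Frames are plain 2D lists of pixel values, and all names are hypothetical.

```python
def stamp(frame, patch, top, left):
    """Return a copy of `frame` with `patch` written at (top, left)."""
    out = [row[:] for row in frame]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            out[top + r][left + c] = value
    return out


def poison_static(video, patch, top, left):
    """Same patch, same position, in every frame."""
    return [stamp(frame, patch, top, left) for frame in video]


def poison_dynamic(video, patch, top):
    """Same patch, but its horizontal position shifts with the frame index
    (wrapping so the patch always stays inside the frame)."""
    width = len(video[0][0])
    span = width - len(patch[0]) + 1
    return [stamp(frame, patch, top, t % span) for t, frame in enumerate(video)]
```

In a poisoned-label setting, the clips modified this way would also have their labels changed to the attacker's target class before training; that step is omitted here.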
Unmanned aerial vehicle (UAV) swarms are considered a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and ability to provide services collaboratively and autonomously. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms, such as federated learning (FL), multi-agent reinforcement learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such as reconfigurable intelligent surfaces (RIS), virtual reality (VR), and semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions for DL-enabled UAV swarms. In summary, this survey provides a comprehensive overview of various DL applications for UAV swarms in a wide range of scenarios.
Compared to regular cameras, Dynamic Vision Sensors or Event Cameras can output compact visual data based on a change in the intensity in each pixel location asynchronously. In this paper, we study the application of current image-based SLAM techniques to these novel sensors. To this end, the information in adaptively selected event windows is processed to form motion-compensated images. These images are then used to reconstruct the scene and estimate the 6-DOF pose of the camera. We also propose an inertial version of the event-only pipeline to assess its capabilities. We compare the results of different configurations of the proposed algorithm against the ground truth for sequences of two publicly available event datasets. We also compare the results of the proposed event-inertial pipeline with the state-of-the-art and show it can produce comparable or more accurate results provided the map estimate is reliable.
With Twitter's growth and popularity, a huge number of views are shared by users on various topics, making this platform a valuable information source on various political, social, and economic issues. This paper investigates English tweets on the Russia-Ukraine war to analyze trends reflecting users' opinions and sentiments regarding the conflict. The tweets' positive and negative sentiments are analyzed using a BERT-based model, and the time series associated with the frequency of positive and negative tweets for various countries is calculated. Then, we propose a method based on the neighborhood average for modeling and clustering the time series of countries. The clustering results provide valuable insight into public opinion regarding this conflict. Among other things, we can mention the similar thoughts of users from the United States, Canada, the United Kingdom, and most Western European countries versus the shared views of Eastern European, Scandinavian, Asian, and South American nations toward the conflict.
The performance of Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from degraded performance. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy-based or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluate the proposed method on a curated dataset, where the noise type and level of the various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate that our model is more confident in predicting GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.
This paper deals with the problem of statistical and system heterogeneity in a cross-silo Federated Learning (FL) framework where there is a limited number of Consumer Internet of Things (CIoT) devices in a smart building. We propose a novel Graph Signal Processing (GSP)-inspired aggregation rule based on graph filtering, dubbed "G-Fedfilt". The proposed aggregator enables a structured flow of information based on the graph's topology. This behavior allows capturing the interconnection of CIoT devices and training domain-specific models. The embedded graph filter is equipped with a tunable parameter that enables a continuous trade-off between domain-agnostic and domain-specific FL. In the domain-agnostic case, it forces G-Fedfilt to act similarly to the conventional Federated Averaging (FedAvg) aggregation rule. The proposed G-Fedfilt also enables an intrinsic smooth clustering based on the graph connectivity, without the clusters being explicitly specified, which further boosts the personalization of the models in the framework. In addition, the proposed scheme employs communication-efficient time scheduling to alleviate system heterogeneity. This is accomplished by adaptively adjusting the number of training data samples and the sparsity of the models' gradients to reduce communication desynchronization and latency. Simulation results show that the proposed G-Fedfilt achieves up to $3.99\%$ better classification accuracy than the conventional FedAvg with respect to model personalization on the statistically heterogeneous local datasets, while it yields up to $2.41\%$ higher accuracy than FedAvg when testing the generalization of the models.
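The idea of aggregating client models through a graph filter can be sketched with a generic low-pass filter based on Laplacian diffusion. This is not the G-Fedfilt filter itself, whose exact form the abstract does not give; the diffusion rule, the parameters `eps` and `steps`, and the use of unweighted averaging are assumptions chosen only to show how a tunable filter can move between keeping each client's model (no mixing) and plain averaging (full mixing on a connected graph).

```python
def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A for a symmetric adjacency
    matrix with zero diagonal, given as a list of rows."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def graph_filter_aggregate(client_params, adj, eps, steps):
    """Apply `steps` rounds of the diffusion x <- x - eps * L x to the
    stacked client parameter vectors (one row per client).

    steps = 0 leaves every client model unchanged (fully domain-specific);
    more diffusion pulls connected clients toward their common average.
    """
    lap = laplacian(adj)
    n = len(client_params)
    dim = len(client_params[0])
    current = [list(p) for p in client_params]
    for _ in range(steps):
        current = [
            [
                current[i][d]
                - eps * sum(lap[i][j] * current[j][d] for j in range(n))
                for d in range(dim)
            ]
            for i in range(n)
        ]
    return current
```

On a fully connected graph of n clients, a single step with `eps = 1/n` returns the unweighted mean for every client, which matches FedAvg when all clients hold equally sized datasets. On a graph with disconnected groups, each group converges only toward its own average, which is one way to picture the clustering behavior the abstract describes.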